Method and apparatus for implementing motion scalability

ABSTRACT

An apparatus and method for improving the compression efficiency of a multi-layered motion vector in a video coding method by efficiently predicting a motion vector in an enhancement layer from a motion vector in a base layer. The apparatus includes a base layer determining module that determines, using the obtained motion vector, a motion vector component of a base layer having the base layer pixel accuracy, and an enhancement layer determining module that determines a motion vector component of an enhancement layer, having the enhancement layer pixel accuracy, that is close to the obtained motion vector.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority from Korean Patent Application No. 10-2004-0032237 filed on May 7, 2004, in the Korean Intellectual Property Office, and U.S. Provisional Patent Application No. 60/560,250 filed on Apr. 8, 2004, in the United States Patent and Trademark Office, the disclosures of which are incorporated herein by reference in their entirety.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a video compression method, and more particularly, to an apparatus and a method for improving the compression efficiency of a motion vector by efficiently predicting a motion vector in an enhancement layer from a motion vector in a base layer, in a video coding method using a multilayer structure.

2. Description of the Related Art

The development of information technology (IT) such as the Internet has increased text, voice and video communication. Conventional text communication cannot satisfy the various demands of users, and thus multimedia services that can provide various types of information such as text, pictures, and music have increased. Multimedia data requires a large capacity storage medium and a wide bandwidth for transmission since the size of multimedia data is usually large. Accordingly, a compression coding method for transmitting multimedia that includes text, video, and audio is necessary.

A basic principle of data compression is removing data redundancy. Data can be compressed by removing spatial redundancy where the same color or object is repeated in an image, or by removing temporal redundancy where there is little change between adjacent frames in a moving image or the same sound is repeated in audio, or by removing visual redundancy taking into account human eyesight and limited perception of high frequency.

Currently, most video coding standards are based on a motion compensation estimation coding method. Temporal redundancy is usually removed by temporal filtering based on motion compensation, and spatial redundancy is usually removed by spatial transform.

To transmit multimedia created after removing data redundancy, transmission media are necessary. Different types of transmission media for multimedia have different performance. Currently used transmission media have various transmission rates. For example, an ultrahigh-speed communication network can transmit data at a rate of several megabits per second while a mobile communication network has a transmission rate of 384 kilobits per second.

Accordingly, to support transmission media having various speeds or to transmit multimedia data at a rate suitable to a transmission environment, data coding methods having scalability, such as wavelet video coding and subband video coding, may be suitable to a multimedia environment.

Scalability refers to the ability to partially decode a single compressed bitstream at a decoder or a pre-decoder part. The decoder or pre-decoder can reconstruct multimedia sequences having different quality levels, resolutions, or frame rates from only some of the bitstreams coded by a scalable coding method.

In a conventional video coding technique, a bitstream typically consists of motion information (motion vector, block size, etc.) and texture information corresponding to a residual obtained after motion estimation.

In a conventional method for achieving texture scalability, wavelet transform and embedded quantization are used to implement spatial scalability and Motion Compensated Temporal Filtering is used to provide temporal scalability.

Another method for implementing texture scalability is to temporally or spatially construct texture information into multiple layers. For example, the texture information consists of multiple layers: i.e., a base layer, a first enhancement layer, and a second enhancement layer. To support spatial scalability, the respective layers have different resolution levels: i.e., Quarter Common Intermediate Format (QCIF), Common Intermediate Format (CIF), and 2CIF. Signal-to-noise ratio (SNR) and temporal scalabilities are implemented within each layer.

In existing video coding schemes, motion information is usually compressed losslessly as a whole. However, the non-scalable motion information can significantly degrade the coding efficiency due to an excessive amount of motion information, especially for a bitstream compressed at low bitrates. In order to solve this problem, research is being actively conducted to implement motion scalability. One method to support motion scalability is to divide motion information into layers according to relative significance and, at low bitrates, to transmit only part of the motion information with loss, giving more bits to textures. Motion scalability is an issue of great concern to MPEG-21 Part 13 scalable video coding.

Recently, various approaches have been proposed for implementing motion scalability by constructing a motion vector into multiple layers. The approaches are divided into two categories: a partition-based approach and an accuracy-based approach.

The partition-based approach generates a multi-layered motion vector by obtaining motion vectors for various resolutions in a frame with the same pixel accuracy. The accuracy-based approach generates a multi-layered motion vector by obtaining motion vectors for various pixel accuracies in a frame having one resolution.

The present invention proposes a method for implementing motion scalability by reconstructing a motion vector into multiple layers using the pixel accuracy-based approach. This method is focused on providing high coding performance for a base layer and an enhancement layer simultaneously.

SUMMARY OF THE INVENTION

The present invention provides a method for efficiently implementing motion scalability using a motion vector consisting of multiple layers.

The present invention also provides a method for improving coding efficiency when using only a base layer at a low bitrate by constructing a motion vector into layers according to the pixel accuracy in such a way as to minimize distortion.

The present invention also provides a method for improving coding performance by minimizing overhead when using all layers at a high bitrate.

According to an aspect of the present invention, there is provided an apparatus for reconstructing a motion vector obtained at the predetermined pixel accuracy including a base layer determining module determining a motion vector component of a base layer using the obtained motion vector according to the pixel accuracy of the base layer, and an enhancement layer determining module determining a motion vector component of an enhancement layer that is close to the obtained motion vector according to the pixel accuracy of the enhancement layer.

The base layer determining module may determine the motion vector component of the base layer that is close to a value predicted from motion vectors of neighboring blocks according to the pixel accuracy of the base layer.

In order to determine the motion vector component of the base layer according to the pixel accuracy of the base layer, the base layer determining module may separate the obtained motion vector into a sign and a magnitude, may use an unsigned value to represent the magnitude of the motion vector, and may attach the original sign to the value.

The base layer determining module may determine a value closest to the obtained motion vector as the motion vector component of the base layer according to the pixel accuracy of the base layer.

The motion vector component x_b of the base layer may be determined using $x_b = \operatorname{sign}(x)\,\lfloor |x| + 0.5 \rfloor$, where sign(x) denotes a sign function that returns values of 1 and −1 when x is a positive value and a negative value, respectively, |x| denotes an absolute value function with respect to variable x, and $\lfloor |x| + 0.5 \rfloor$ denotes a function giving the largest integer not exceeding |x|+0.5 by stripping the decimal part.

The apparatus for reconstructing a motion vector obtained at a predetermined pixel accuracy may further include a first compression module removing redundancy in a motion vector component of a first enhancement layer among the enhancement layers using the fact that the motion vector component of the first enhancement layer has an opposite sign to the motion vector component of the base layer when the motion vector component of the first enhancement layer is not 0.

The apparatus for reconstructing a motion vector obtained at the predetermined pixel accuracy may further include a second compression module removing redundancy in a motion vector component of a second enhancement layer using the fact that the motion vector component of the second enhancement layer is always 0 when the motion vector component of the first enhancement layer is not 0.

According to another aspect of the present invention, there is provided a video encoder using a motion vector consisting of multiple layers, the encoder including a motion vector reconstruction module including a motion vector search module obtaining a motion vector with the predetermined pixel accuracy, a base layer determining module determining a motion vector component of a base layer using the obtained motion vector according to the pixel accuracy of the base layer, an enhancement layer determining module determining a motion vector component of an enhancement layer that is close to the obtained motion vector according to the pixel accuracy of the enhancement layer, a temporal filtering module removing temporal redundancies by filtering frames in a direction of a temporal axis using the obtained motion vectors, a spatial transform module removing spatial redundancies from the frames from which the temporal redundancies have been removed and creating transform coefficients, and a quantization module performing quantization on the transform coefficients.

According to still another aspect of the present invention, there is provided an apparatus for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the apparatus including a layer reconstruction module reconstructing motion vector components of the respective layers from corresponding values of the layers interpreted from an input bitstream, and a motion addition module adding the reconstructed motion vector components of the layers together and providing the motion vector.

According to yet another aspect of the present invention, there is provided an apparatus for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the apparatus including a first reconstruction module reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer interpreted from an input bitstream, which is opposite to the sign of a corresponding value of the base layer, a layer reconstruction module reconstructing motion vector components of the base layer and at least one enhancement layer other than the first enhancement layer from values of the base layer and the at least one enhancement layer interpreted from the input bitstream, and a motion addition module adding the reconstructed motion vector components of the layers together and providing the motion vector.

According to a further aspect of the present invention, there is provided an apparatus for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the apparatus including a first reconstruction module reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer interpreted from an input bitstream, which is opposite to the sign of a corresponding value of the base layer, a second reconstruction module setting a motion vector component of a second enhancement layer to 0 when the value of the first enhancement layer is not 0 and reconstructing the motion vector component of the second enhancement layer from a value of the second enhancement layer interpreted from the input bitstream when the value of the first enhancement layer is 0, a layer reconstruction module reconstructing motion vector components of the base layer and at least one enhancement layer other than the first and second enhancement layers from values of the base layer and the at least one enhancement layer interpreted from the input bitstream, and a motion addition module adding the reconstructed motion vector components of the layers together and providing the motion vector.
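The two reconstruction rules of this aspect can be illustrated with a minimal Python sketch. It is not the claimed apparatus: it assumes that the first enhancement layer value arrives as an unsigned magnitude (0 or 1), that the enhancement layers use half- and quarter-pixel steps as in the example given in the detailed description, and that only one vector component is handled. The function and parameter names are hypothetical.

```python
def reconstruct_component(base, e1_magnitude, e2_symbol=0):
    """Rebuild one motion vector component from its layer values.

    base         -- integer-pixel base layer component
    e1_magnitude -- 0 or 1: first enhancement layer value without its sign
                    (assumed representation, not stated in the text)
    e2_symbol    -- -1, 0 or 1: second enhancement layer symbol; ignored
                    when e1_magnitude is non-zero
    """
    base_sign = (base > 0) - (base < 0)
    # First reconstruction rule: attach the sign opposite to the base layer.
    # A zero base with a non-zero magnitude is not handled by this sketch.
    e1_symbol = -base_sign * e1_magnitude
    # Second reconstruction rule: the second layer is 0 whenever the first is not.
    if e1_magnitude != 0:
        e2_symbol = 0
    # Motion addition: sum the layer components (assumed half/quarter-pixel steps).
    return base + 0.5 * e1_symbol + 0.25 * e2_symbol


print(reconstruct_component(1, 1))       # base 1, first layer non-zero
print(reconstruct_component(1, 0, -1))   # base 1, first layer zero, second layer read
```

Because the sign of the first enhancement layer and the zero value of the second enhancement layer are both implied by other layers in these cases, neither needs to be carried explicitly in the bitstream.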

According to another aspect of the present invention, there is provided a video decoder using a motion vector consisting of multiple layers, the decoder including an entropy decoding module interpreting an input bitstream and extracting texture information and motion information from the bitstream, a motion vector reconstruction module reconstructing motion vector components of the respective layers from corresponding values of the layers contained in the extracted motion information and providing the motion vector after adding the motion vector components of the respective layers together, an inverse quantization module applying inverse quantization to the texture information and outputting transform coefficients, an inverse spatial transform module inversely transforming the transform coefficients into transform coefficients in a spatial domain by performing the inverse of spatial transform, and an inverse temporal filtering module performing inverse temporal filtering on the transform coefficients in the spatial domain using the obtained motion vector and reconstructing frames in a video sequence.

The motion vector reconstruction module may include a first reconstruction module reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer contained in the motion information, which is opposite to the sign of a corresponding value of the base layer, a layer reconstruction module reconstructing motion vector components of the base layer and at least one enhancement layer other than the first enhancement layer from values of the base layer and the at least one enhancement layer, and a motion addition module adding the reconstructed motion vector components of the layers together and providing the motion vector.

In addition, the motion vector reconstruction module may include a first reconstruction module reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer contained in the motion information, which is opposite to the sign of a corresponding value of the base layer, a second reconstruction module setting a motion vector component of a second enhancement layer to 0 when the value of the first enhancement layer is not 0 and reconstructing the motion vector component of the second enhancement layer from a value of the second enhancement layer contained in the motion information when the value of the first enhancement layer is 0, a layer reconstruction module reconstructing motion vector components of the base layer and at least one enhancement layer other than the first and second enhancement layers from values of the base layer and the at least one enhancement layer contained in the motion information, and a motion addition module adding the reconstructed motion vector components of the layers together and providing the motion vector.

According to still another aspect of the present invention, there is provided a method for reconstructing a motion vector obtained at the predetermined pixel accuracy, the method including determining a motion vector component of a base layer using the obtained motion vector according to the pixel accuracy of the base layer, and determining a motion vector component of an enhancement layer that is close to the obtained motion vector according to the pixel accuracy of the enhancement layer.

In the determining of the motion vector component of the base layer, the motion vector component of the base layer may be determined to be close to a value predicted from motion vectors of neighboring blocks according to the pixel accuracy of the base layer.

In the determining of the motion vector component of the base layer, the motion vector component of the base layer may be determined according to the pixel accuracy of the base layer by separating the obtained motion vector into a sign and a magnitude, using an unsigned value to represent the magnitude of the motion vector, and attaching the original sign to the value.

In the determining of the motion vector component of the base layer, a value closest to the obtained motion vector may be determined as the motion vector component of the base layer according to the pixel accuracy of the base layer.

According to yet another aspect of the present invention, there is provided a method for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the method including reconstructing motion vector components of the respective layers from corresponding values of the layers interpreted from an input bitstream, and adding the reconstructed motion vector components of the layers together and providing the motion vector.

According to a further aspect of the present invention, there is provided a method for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the method including reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer interpreted from an input bitstream, which is opposite to the sign of a corresponding value of the base layer, reconstructing motion vector components of the base layer and at least one enhancement layer other than the first enhancement layer from values of the base layer and the at least one enhancement layer interpreted from the input bitstream, and adding the reconstructed motion vector components of the layers together and providing the motion vector.

According to still another aspect of the present invention, there is provided a method for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the method including reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer interpreted from an input bitstream, which is opposite to the sign of a corresponding value of the base layer, setting a motion vector component of a second enhancement layer to 0 when the value of the first enhancement layer is not 0 and reconstructing the motion vector component of the second enhancement layer from a value of the second enhancement layer interpreted from the input bitstream when the value of the first enhancement layer is 0, reconstructing motion vector components of the base layer and at least one enhancement layer other than the first and second enhancement layers from values of the base layer and the at least one enhancement layer interpreted from the input bitstream, and adding the reconstructed motion vector components of the layers together and providing the motion vector.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings in which:

FIG. 1 is a diagram for explaining a method of reconstructing a multi-layered motion vector according to the pixel accuracy;

FIG. 2 illustrates a method for improving the compression efficiency of a motion vector according to a first embodiment of the present invention;

FIG. 3 illustrates an example of obtaining a predicted value for a current block by correlation with neighboring blocks;

FIG. 4 illustrates a third embodiment of the present invention;

FIG. 5 is a graph illustrating the results of measuring peak signal-to-noise ratios (PSNRs) as a video quality indicator using motion vectors according to the first through third embodiments of the present invention;

FIG. 6A is a graph illustrating the results of measuring a PSNR when compressing a Foreman CIF sequence at 100 Kbps according to the third embodiment of the present invention;

FIG. 6B is a graph comparing the experimental results of the third embodiment of FIG. 6A and the fourth embodiment of the present invention;

FIG. 7 is a block diagram of a video coding system;

FIG. 8 is a block diagram of a video encoder;

FIG. 9 is a block diagram of an exemplary motion vector reconstruction module according to the first embodiment of the present invention;

FIG. 10 is an illustration for explaining a process of obtaining a motion vector of an enhancement layer;

FIG. 11 is a block diagram of another exemplary motion vector reconstruction module for implementing the method according to the fourth embodiment of the present invention;

FIG. 12 is a block diagram of a video decoder;

FIG. 13 is a block diagram of an exemplary motion vector reconstruction module according to the present invention;

FIG. 14 is a block diagram of another exemplary motion vector reconstruction module for implementing the method according to the fourth embodiment of the present invention;

FIG. 15 is a schematic diagram illustrating a bitstream structure;

FIG. 16 is a diagram illustrating the detailed structure of each group of pictures (GOP) field; and

FIG. 17 is a diagram illustrating the detailed structure of a motion vector (MV) field.

DETAILED DESCRIPTION OF THE INVENTION

The present invention presents a method for constructing a base layer in such a way as to minimize distortion when only the base layer is used, and a method for quantizing an enhancement layer in such a way as to minimize overhead when all layers are used.

The present invention will now be described more fully with reference to the accompanying drawings, in which exemplary embodiments of this invention are shown. Advantages and features of the present invention and methods of accomplishing the same may be understood more readily by reference to the following detailed description of exemplary embodiments and the accompanying drawings. The present invention may, however, be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concept of the invention to those skilled in the art, and the present invention will only be defined by the appended claims. Like reference numerals refer to like elements throughout the specification.

FIG. 1 shows an example in which one motion vector is divided into three motion vector components. Referring to FIG. 1, after finding a motion vector A with the predetermined pixel accuracy, the motion vector A is reconstructed as the sum of a base layer motion vector component B, a first enhancement layer motion vector component E1, and a second enhancement layer motion vector component E2. A motion vector obtained as a result of a motion vector search with the predetermined pixel accuracy as described above is defined as an “actual motion vector”.

The pixel accuracy used for the highest enhancement layer can typically be selected as the predetermined pixel accuracy. The motion vectors of the respective layers have different pixel accuracies that increase in order from the lowest layer (close to the base layer) to the highest layer (away from the base layer). For example, the base layer has one-pixel accuracy, the first enhancement layer has half-pixel accuracy, and the second enhancement layer has quarter-pixel accuracy.

An encoder transmits the reconstructed motion vector to a predecoder, which truncates a part of the motion vector in order from the highest to the lowest layer, and a decoder receives the remaining part of the motion vector. This process makes it possible to implement scalability for a motion vector (motion scalability).

For example, an encoder may transmit motion vector components of all layers (the base layer, the first enhancement layer, and the second enhancement layer) while the predecoder may transmit only components of the base layer and the first enhancement layer to the decoder by truncating a component of the second enhancement layer when it determines according to available communication conditions that transmission of all the motion vector components is unsuitable. The decoder uses the components of the base layer and the first enhancement layer to reconstruct a motion vector.
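The truncation performed by the predecoder can be sketched as follows. This is only an illustration under stated assumptions: the layer components are held in a plain list ordered from the base layer upward, the component values are illustrative, and the function names are hypothetical rather than part of the described system.

```python
def predecoder_truncate(components, layers_to_keep):
    """Drop layers from the highest downward; the base layer is always kept.

    components     -- [B, E1, E2, ...] ordered from the base layer upward
    layers_to_keep -- number of layers forwarded to the decoder (>= 1)
    """
    if layers_to_keep < 1:
        raise ValueError("the base layer cannot be dropped")
    return components[:layers_to_keep]


def decoder_reconstruct(components):
    """The decoder sums whatever layer components it received."""
    return sum(components)


# Illustrative component values only: base, half-pixel and quarter-pixel parts.
full = [0.0, 0.5, 0.25]

for kept in (3, 2, 1):
    received = predecoder_truncate(full, kept)
    print(kept, "layer(s) ->", decoder_reconstruct(received))
```

With all three layers the decoder recovers the full-accuracy value; each truncated layer coarsens the reconstructed motion vector by one accuracy level.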

The base layer is essential motion vector information having the highest priority and it cannot be omitted during transmission. Thus, a bitrate in the base layer must be equal to or less than the minimum bandwidth supported by a network. The bitrate in transmission of all the layers (the base layer and the first and second enhancement layers) must be equal to or less than the maximum bandwidth.

Method for Constructing the Base Layer

The present invention proposes methods for constructing a base layer according to first through third embodiments and verifies the methods through experiments.

In each embodiment, a motion vector is constructed into multiple layers: a motion vector component of the base layer represented with integer-pixel accuracy, and motion vector components of enhancement layers respectively represented with half- and quarter-pixel accuracy.

The base layer uses an integer to represent a motion vector component, and the enhancement layers use a symbol of 1, −1, or 0 instead of a real number in order to represent motion vector components in a simple way. While a motion vector is usually represented by a pair of x and y components, only one component will be described throughout this specification for clarity of explanation.

For example, while the motion vector component of the first enhancement layer with half pixel accuracy may have a value of −0.5, 0.5, or 0, it is represented by the symbol −1, 1, or 0. Similarly, while the motion vector component of the second enhancement layer with quarter pixel accuracy may have a value of −0.25, 0.25, or 0, it is represented by the symbol −1, 1, or 0.
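The mapping between enhancement layer values and their symbols can be written out as a short sketch. The dictionary `STEP` and the two function names are hypothetical; the sketch only assumes the half- and quarter-pixel steps stated above.

```python
# Pixel step of each enhancement layer: layer 1 is half-pixel, layer 2 quarter-pixel.
STEP = {1: 0.5, 2: 0.25}


def to_symbol(value, layer):
    """Map an enhancement layer value (e.g. -0.5) to its symbol (-1, 0 or 1)."""
    symbol = value / STEP[layer]
    if symbol not in (-1, 0, 1):
        raise ValueError("value is not representable in this layer")
    return int(symbol)


def to_value(symbol, layer):
    """Map a symbol (-1, 0 or 1) back to the value it stands for."""
    return symbol * STEP[layer]


print(to_symbol(-0.5, 1), to_symbol(0.25, 2))   # symbols for -0.5 and 0.25
print(to_value(-1, 2))                          # value for symbol -1 in layer 2
```

Using the same three symbols in every enhancement layer keeps the alphabet small, with the layer index alone determining the step that a symbol represents.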

Since a motion vector of the base layer is represented by an integer part, there is a close spatial correlation between motion vectors in the base layer. Thus, after considering this spatial correlation and obtaining a predicted value of a current block from the integer motion vectors of neighboring blocks, only a residual between an actual motion vector of the current block and the predicted value is encoded and transmitted. Conversely, the enhancement layers are usually encoded without considering neighboring blocks because there is little spatial correlation between motion vectors.

One of the most important goals in implementing motion scalability is to prevent significant degradation in coding performance when an enhancement layer is truncated. If truncating the enhancement layer increases the motion vector error and thereby significantly degrades the quality of the video reconstructed by a decoder, the benefit of allocating the saved motion vector bits to texture information is also reduced. Therefore, the first through third embodiments of the present invention are focused on preventing a significant drop in the peak signal-to-noise ratio (PSNR) when only a base layer is used, compared to when a base layer and enhancement layers are used.

In a first embodiment of the present invention, a method for improving the compression efficiency of a motion vector using a spatial correlation of the base layer is proposed. According to the first embodiment, the decimal part of an actual value is rounded up or down so that the resultant value is closer to a value predicted from the motion vector components of the neighboring blocks in the base layer. FIG. 2 shows an example of predicting a motion vector in first and second enhancement layers from a motion vector in a base layer. Referring to FIG. 2, when a value predicted from neighboring blocks in the base layer is −1 and an actual motion vector value is 0.75, the actual motion vector value is rounded down to 0, which is closer to the predicted value of −1, and then motion vector values of 1 in the first and second enhancement layers are predicted from the motion vector value of 0 in the base layer.
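The rounding rule of the first embodiment can be sketched in a few lines of Python. The function name is hypothetical, and the handling of a tie between the two integer candidates (the lower one is chosen) is an assumption that the text does not specify.

```python
import math


def base_toward_prediction(actual, predicted):
    """Round `actual` up or down, whichever integer is closer to `predicted`."""
    lower, upper = math.floor(actual), math.ceil(actual)
    # Assumption: when both candidates are equally close, keep the lower one.
    if abs(lower - predicted) <= abs(upper - predicted):
        return lower
    return upper


# Values from the FIG. 2 example: actual 0.75, prediction -1.
base = base_toward_prediction(0.75, -1)
print(base)

# Enhancement symbols of 1 and 1 (half- and quarter-pixel steps) restore 0.75.
print(base + 0.5 * 1 + 0.25 * 1)
```

Choosing the candidate nearer to the prediction keeps the residual that is actually encoded for the base layer as small as possible.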

FIG. 3 illustrates an example of obtaining a predicted value for a current block by its correlation with neighboring blocks. Referring to FIG. 3, when motion vectors in a base layer are determined in the diagonal direction, a predicted value of a current block (a) is obtained by correlation with neighboring blocks (b), (c), and (d), whose motion vectors have been determined. The predicted value may be the median or average value of the motion vectors of the neighboring blocks (b), (c), and (d). In the first embodiment, as shown in FIG. 3, the integer value of the current block (a) is chosen to be the one closer to the predicted value obtained from the neighboring blocks.
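A median-based prediction and the resulting residual can be sketched as below. The neighbor values are hypothetical, chosen only for illustration, and the function names are not taken from the specification; the median is one of the two options (median or average) mentioned above.

```python
from statistics import median


def predict_base(neighbor_vectors):
    """Predicted value for the current block: median of the neighbors' integer
    base layer components (the text also allows the average)."""
    return median(neighbor_vectors)


def residual_to_encode(current_base, neighbor_vectors):
    """Only the difference from the prediction is encoded for the base layer."""
    return current_base - predict_base(neighbor_vectors)


# Hypothetical base layer components of neighboring blocks (b), (c) and (d).
neighbors_bcd = [-1, 0, -1]

print(predict_base(neighbors_bcd))            # prediction for block (a)
print(residual_to_encode(0, neighbors_bcd))   # residual if block (a) uses 0
```

With an odd number of integer neighbors the median is itself an integer, so the residual stays in integer-pixel units.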

According to the first embodiment, since a motion vector component of the base layer is quantized using a residual between the actual value and the predicted value obtained from the neighboring blocks, it is possible to represent the motion vector component of the base layer by the integer value closest to the predicted value, thereby most efficiently quantizing the base layer. As such, this method is efficient in reducing the size of a base layer.

A feature of a second embodiment of the present invention is that an integer motion vector component of a base layer is as close to zero as possible. In the second embodiment, to make the motion vector component of the base layer as close to zero as possible, an actual motion vector is separated into sign and magnitude. The magnitude of the motion vector is represented using an unsigned integer and the original sign is then attached to the unsigned integer. This method makes it more probable that the motion vector component of the base layer is zero, which enables more efficient quantization since most quantization modules quantize zeros very efficiently. This method is expressed by Equation (1):

$$x_b = \operatorname{sign}(x)\,\lfloor |x| \rfloor \qquad (1)$$

where sign(x) denotes a sign function that returns values of 1 and −1 when x is a positive value and a negative value, respectively, |x| denotes the absolute value of variable x, and $\lfloor x \rfloor$ denotes a function giving the largest integer not exceeding x (by stripping the decimal part).

Table 1 shows examples of values for each layer that can be obtained with the values x and x_b in Equation (1). For convenience of explanation, the values x and x_b are multiplied by a factor of 4 and expressed as integer values, and 4(x−x_b) in the lowest row denotes an error between an actual value and an integer motion vector of the base layer. E1 and E2 respectively denote motion vector components of the first and second enhancement layers, expressed as symbols.

TABLE 1

4x            −7  −6  −5  −4  −3  −2  −1   0   1   2   3   4   5   6   7
4x_b          −4  −4  −4  −4   0   0   0   0   0   0   0   4   4   4   4
E1            −1  −1   0   0  −1  −1   0   0   0   1   1   0   0   1   1
E2            −1   0  −1   0   1   0  −1   0   1   0  −1   0   1   0  −1
4(x−x_b)      −3  −2  −1   0  −3  −2  −1   0   1   2   3   0   1   2   3
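Equation (1) can be checked numerically with a brief sketch that works in quarter-pixel units (q = 4x), the same scaling used in Table 1. The function name is hypothetical; the sketch reproduces only the 4x_b row and the 4(x−x_b) row of the table.

```python
def base_truncate(q):
    """Equation (1) in quarter-pixel units: takes q = 4x and returns 4*x_b.

    The sign and magnitude are handled separately, so the magnitude is
    truncated toward zero for both positive and negative inputs.
    """
    sign = (q > 0) - (q < 0)
    return sign * (abs(q) // 4) * 4


quarter_values = list(range(-7, 8))
base_row = [base_truncate(q) for q in quarter_values]
error_row = [q - base_truncate(q) for q in quarter_values]

print(base_row)    # compare with the 4x_b row
print(error_row)   # compare with the 4(x - x_b) row
```

Separating the sign matters for negative inputs: a plain floor of −1.75 gives −2, whereas Equation (1) gives −1, which pulls the base layer component toward zero on both sides.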

As is evident from Table 1, the method of the second embodiment makes it more likely that the integer motion vector component x_b of the base layer is zero, thereby increasing the compression efficiency as compared to the first embodiment in which x_b is obtained by simply truncating the decimal part ($x_b = \lfloor x \rfloor$). However, as in the first embodiment, motion vector components of the first and second enhancement layers are expressed as the symbols −1, 0, or 1, which results in reduced efficiency. Furthermore, like the first embodiment, the second embodiment suffers from significant distortion when only the base layer is used, because the difference between the actual and quantized motion vectors can be as much as 0.75.

In a third embodiment of the present invention, the difference between an actual motion vector and a quantized motion vector of a base layer is minimized. That is, the third embodiment concentrates on reducing that difference to 0.5 or less, which is an improvement over the first and second embodiments where the maximum difference is 0.75. This is accomplished by modifying the second embodiment to some extent. That is, an integer nearest to an actual motion vector is selected as a motion vector component of the base layer by rounding off the actual motion vector, as defined by Equation (2):

$$x_b = \operatorname{sign}(x)\,\lfloor |x| + 0.5 \rfloor \qquad (2)$$

Equation (2) is similar to Equation (1) except for the use of rounding off. FIG. 4 shows an example in which a motion vector with a value of 0.75 is represented according to the third embodiment of the present invention. Referring to FIG. 4, unlike the first and second embodiments, the value 1 is selected as a motion vector component of a base layer since 1 is the integer nearest to the actual motion vector of 0.75. As shown in FIG. 4, a motion vector component of the first enhancement layer that minimizes the difference between the actual motion vector and the motion vector of the first enhancement layer may be −0.5 or 0 (a motion vector of the first enhancement layer is the sum of a motion vector of the base layer and a motion vector component of the first enhancement layer).

In either case, the minimum difference is 0.25. When two or more values with the minimum error are present in the first enhancement layer, the value closest to the motion vector component of the immediately lower layer is chosen as the motion vector component of the first enhancement layer.

Thus, the value 0 is finally selected as the motion vector component of the first enhancement layer.

By doing so, the difference between the actual motion vector and the motion vector component of the base layer can be reduced to 0.25. The third embodiment of the present invention provides improved coding performance when only a base layer is used by limiting the difference to 0.5 or less. However, this method has the drawback of increasing the size of the base layer over the first or second embodiments. Table 2 shows examples of values that can be created by Equation (2).

TABLE 2

| 4x | −7 | −6 | −5 | −4 | −3 | −2 | −1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4x_b | −8 | −8 | −4 | −4 | −4 | −4 | 0 | 0 | 0 | 4 | 4 | 4 | 4 | 8 | 8 |
| E1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | −1 | 0 | 0 | 0 | −1 | 0 |
| E2 | 1 | 0 | −1 | 0 | 1 | 0 | −1 | 0 | 1 | 0 | −1 | 0 | 1 | 0 | −1 |
| 4(x−x_b) | 1 | 2 | −1 | 0 | 1 | 2 | −1 | 0 | 1 | 2 | −1 | 0 | 1 | 2 | −1 |

As is evident from Table 2, in the third embodiment there is a higher probability that the motion vector component E1 of the first enhancement layer will be zero, which results in higher compression efficiency. However, the motion vector component E2 of the second enhancement layer is more complicated, so more bits are allocated for coding it. In particular, 4(x−x_b) in the lowest row indicates that the difference between the motion vector component of the base layer and the actual motion vector is at most 0.5.
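The 4x_b, E1 and E2 rows of Table 2 can be recomputed with a short Python sketch that works in quarter-pel integer units. The function names are illustrative, and the tie-break (prefer the symbol 0, so the layer stays at the value of the immediately lower layer) is an assumption that follows the selection rule described for FIG. 4; it is not taken verbatim from the specification.

```python
def base_layer_round(q: int) -> int:
    """Equation (2) in quarter-pel units: returns 4 * sign(x) * floor(|x| + 0.5), x = q / 4."""
    sign = -1 if q < 0 else 1
    return 4 * sign * ((abs(q) + 2) // 4)

def enhancement_symbol(target: int, lower: int, step: int) -> int:
    """Choose s in {-1, 0, 1} so that lower + s * step is closest to target.

    Ties are broken in favour of s = 0, i.e. the value closest to the lower layer.
    """
    return min((-1, 0, 1), key=lambda s: (abs(target - (lower + s * step)), abs(s)))

quarter_pel = list(range(-7, 8))
base_row, e1_row, e2_row = [], [], []
for q in quarter_pel:
    base = base_layer_round(q)
    e1 = enhancement_symbol(q, base, step=2)           # half-pel = 2 quarter-pels
    e2 = enhancement_symbol(q, base + 2 * e1, step=1)  # quarter-pel refinement
    base_row.append(base)
    e1_row.append(e1)
    e2_row.append(e2)

print("4x_b:", base_row)
print("E1  :", e1_row)
print("E2  :", e2_row)
```

With these assumptions the three computed rows agree with the corresponding rows of Table 2, and eleven of the fifteen E1 symbols are zero.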

Table 3 shows the results of experiments where a Foreman CIF sequence is compressed at a frame rate of 30 Hz and a bitrate of 256 Kbps. The experiments were done to verify the performance of the first through third embodiments of the present invention. Table 3 lists the number of bits (hereinafter “size” will refer to “number of bits”) needed for motion vectors of a base layer and first and second enhancement layers according to the first through third embodiments.

TABLE 3

| Layer | First embodiment | Second embodiment | Third embodiment |
|---|---|---|---|
| Base | 42.76 | 45.35 | 48.12 |
| E1 | 20.87 | 21.56 | 13.20 |
| E2 | 24.08 | 24.14 | 24.12 |
| Total | 87.71 | 91.05 | 85.44 |

As is evident from Table 3, the base layer has the smallest size in the first embodiment, but the first and second enhancement layers are large because the motion vector of the base layer is predicted, which increases the total size. While attempting to reduce the size of the motion vector component of the base layer by assigning more zeros to it, the second embodiment increases the size of the base layer as well as the total size compared to the first embodiment. The total size is the largest in the second embodiment.

In the third embodiment, the base layer has the largest size but the first enhancement layer has the smallest size since it is highly probable that a motion vector component of the first enhancement layer will have a value of zero. The second enhancement layer has a size similar to its counterparts in the first and second embodiments.

When only the base layer is used for coding, it is advantageous to select a method where the base layer has the smallest size. When all layers are used for coding, a method that minimizes the total size may be selected. In the former case, the first embodiment is selected, and in the latter case the third embodiment is selected.

FIG. 5 is a graph illustrating the results of measuring PSNRs (as a video quality indicator) using motion vectors from the three layers according to the first through third embodiments of the present invention as detailed in Table 3. Referring to FIG. 5, the third embodiment exhibits the highest performance while the first embodiment exhibits the poorest performance.

In particular, the first embodiment performs similarly to the second embodiment when only the base layer is used, but performs worse than the other embodiments when all motion vector layers are used.

It should be especially noted that the third embodiment exhibits superior performance when only the base layer is used. Specifically, the PSNR value in the third embodiment is more than 1.0 dB higher than that of the second embodiment. This is achieved by minimizing the difference between an integer motion vector component of the base layer and an actual motion vector. That is, since it is more efficient for coding performance to minimize this difference than to slightly decrease an integer value, the third embodiment exhibits the best performance.

Method of Efficiently Compressing the Enhancement Layer

Referring to Table 3, the third embodiment is superior to the first and second embodiments in terms of the size of the first enhancement layer, but differs little from them in terms of the size of the second enhancement layer. Thus, for low bitrate coding where the size of the motion vector largely affects the performance, the third embodiment is not advantageous over the others when all motion vector layers are used.

FIG. 6A is a graph illustrating an experimental result of compressing a Foreman CIF sequence at 100 Kbps according to the third embodiment.

As is evident from FIG. 6A, since 100 Kbps is a low bitrate, the third embodiment exhibits superior performance when only the base layer is used, compared to when all the layers are used. Specifically, while the third embodiment shows excellent performance when the base layer or a combination of the base layer and the first enhancement layer is used, its performance degrades when all the layers are used since the size of the second enhancement layer is large.

However, the third embodiment is intended to allocate a large amount of information to the second enhancement layer. Since the second enhancement layer is used only when the bitrate is sufficient, its large size does not significantly affect performance. For a low bitrate, only the base layer and the first enhancement layer are used, and bits in the second enhancement layer can be truncated.

In order to prevent significant degradation due to the presence of the second enhancement layer in the third embodiment, the present invention proposes a method for providing excellent coding performance when all motion vector layers are used by adding two compression rules.

The two compression rules are found in Table 2. Referring to Table 2, the first rule is that the motion vector component (4x_b) of the base layer has an opposite sign to the motion vector component E1 of the first enhancement layer except, of course, when E1 is zero. In other words, the motion vector component E1 of the first enhancement layer is represented by 0 or 1, and when E1 is 1, a decoder reconstructs the original value of E1 by attaching a sign to E1, which is opposite to the sign of the motion vector component of the base layer.

That is, since E1 has an opposite sign to the motion vector component of the base layer (except zero, which has no sign), E1 can be expressed as either 0 or 1. An encoder converts −1 to 1 while a decoder can reconstruct the original value of E1 by attaching the opposite sign to 1.

By applying the first rule, entropy coding efficiency can be improved since the motion vector component E1 of the first enhancement layer can be expressed as either 0 or 1. An experimental result demonstrated that applying the first rule alone reduces the number of bits by more than 12%.

Referring to Table 2, the second compression rule is that the motion vector component E2 of the second enhancement layer is always 0 when E1 is 1 or −1. Thus, E2 is not encoded when a corresponding E1 is not 0.

In other words, an encoder does not encode E2 when E1 is not 0. A decoder uses 0 as E2 when E1 is not 0, and the received value as E2 when E1 is 0.

An experimental result demonstrated that applying the second rule reduces the number of bits by about 25% before entropy encoding and by about 12% after entropy encoding. This compensates for the drawback of the third embodiment caused by the large second enhancement layer. Table 4 shows the values of Table 2 after applying the first and second compression rules.

TABLE 4

| 4x | −7 | −6 | −5 | −4 | −3 | −2 | −1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4x_b | −8 | −8 | −4 | −4 | −4 | −4 | 0 | 0 | 0 | 4 | 4 | 4 | 4 | 8 | 8 |
| E1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| E2 | 1 | X | −1 | 0 | 1 | X | −1 | 0 | 1 | X | −1 | 0 | 1 | X | −1 |

The symbol “X” in Table 4 denotes a portion not transmitted, and this constitutes a quarter of the total number of cases. Thus, the number of bits can be reduced by 25%. By converting −1 to 1 in the first enhancement layer, compression efficiency can be further increased. A method created by applying the first and second compression rules to the third embodiment is referred to as a ‘fourth embodiment’. The compression rules in the fourth embodiment can also be applied to a base layer, a first enhancement layer, and a second enhancement layer for a motion vector consisting of four or more layers. Furthermore, either the first or second or both rules can be applied depending on the type of application.
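The encoder side of the two compression rules can be sketched as follows, using the rows of Table 2 as input. The function name and the use of `None` to mark an untransmitted value (the “X” of Table 4) are illustrative assumptions; the sketch also checks, before dropping any information, that the relationships the rules rely on actually hold.

```python
X = None  # marks an E2 value that is not transmitted ("X" in Table 4)

def apply_compression_rules(base_row, e1_row, e2_row):
    """Apply rule 1 (send |E1|) and rule 2 (skip E2 when E1 != 0) column by column."""
    coded_e1, coded_e2 = [], []
    for base, e1, e2 in zip(base_row, e1_row, e2_row):
        if e1 != 0:
            # Rule 1 is lossless only if E1 and the base layer have opposite signs.
            if e1 * base >= 0:
                raise ValueError("E1 does not have the opposite sign of the base layer")
            # Rule 2 is lossless only if E2 is 0 whenever E1 is non-zero.
            if e2 != 0:
                raise ValueError("E2 is non-zero although E1 is non-zero")
        coded_e1.append(abs(e1))
        coded_e2.append(X if e1 != 0 else e2)
    return coded_e1, coded_e2

# Rows copied from Table 2 (quarter-pel units, 4x = -7 .. 7).
base_row = [-8, -8, -4, -4, -4, -4, 0, 0, 0, 4, 4, 4, 4, 8, 8]
e1_row = [0, 1, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 0, -1, 0]
e2_row = [1, 0, -1, 0, 1, 0, -1, 0, 1, 0, -1, 0, 1, 0, -1]

coded_e1, coded_e2 = apply_compression_rules(base_row, e1_row, e2_row)
print("E1:", coded_e1)
print("E2:", ["X" if v is X else v for v in coded_e2])
```

For the fifteen columns shown, four E2 values are skipped, which matches the roughly one-quarter figure stated above.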

Table 5 shows the number of bits needed for motion vectors of a base layer, a first enhancement layer, and a second enhancement layer according to the fourth embodiment of the present invention.

TABLE 5

| Layer | Third embodiment | Fourth embodiment | Reduction rate (%) |
|---|---|---|---|
| Base | 48.12 | 48.12 | 0 |
| E1 | 13.20 | 11.13 | 15.68 |
| E2 | 24.12 | 21.25 | 11.90 |
| Total | 85.44 | 80.50 | 5.8 |

As detailed in Table 5, the fourth embodiment reduces the sizes of the first and second enhancement layers by 15.68% and 11.90%, respectively, compared to the third embodiment, thereby significantly reducing the overall bitrate. The number of bits in the second enhancement layer is reduced by less than 25% because the omitted values are zeros, which an entropy encoding module already compresses efficiently.

Nevertheless, the number of bits can be reduced by approximately 12%. FIG. 6B is a graph comparing the experimental results of the third embodiment (FIG. 6A) and the fourth embodiment of the present invention. As shown in FIG. 6B, the fourth embodiment exhibits similar performance to the third embodiment when only the base layer is used, but exhibits superior performance thereto when all the layers are used.
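The reduction rates quoted from Table 5 follow from simple arithmetic on the per-layer sizes. The short Python check below recomputes them; only the numbers come from Table 5, while the variable and function names are illustrative.

```python
# (third embodiment, fourth embodiment) sizes copied from Table 5.
table5 = {"Base": (48.12, 48.12), "E1": (13.20, 11.13), "E2": (24.12, 21.25)}

def reduction_rate(before: float, after: float) -> float:
    """Relative size reduction in percent."""
    return 100.0 * (before - after) / before

rates = {layer: round(reduction_rate(b, a), 2) for layer, (b, a) in table5.items()}
total_before = sum(b for b, _ in table5.values())
total_after = sum(a for _, a in table5.values())
total_rate = round(reduction_rate(total_before, total_after), 2)

print(rates)
print("total:", round(total_before, 2), "->", round(total_after, 2), f"({total_rate}%)")
```

The per-layer results reproduce the 15.68% and 11.90% figures, and the total reduction comes out slightly under 5.8%.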

While it is described above that a motion vector consists of three layers, it will be understood by those skilled in the art that the present invention can apply to a motion vector consisting of more than three layers. Furthermore, it is described above that a motion vector search is performed on a base layer with 1 pixel accuracy, a first enhancement layer with ½ pixel accuracy, and a second enhancement layer with ¼ pixel accuracy. However, this is provided as an example only, and it will be readily apparent to those skilled in the art that the motion vector search may be performed with different pixel accuracies than those stated above. In any case, the pixel accuracy increases with each layer, in a manner similar to the aforementioned embodiments.

In order to implement motion scalability, an encoder encodes an input video using a multilayered motion vector while a predecoder or a decoder decodes all or part of the input video. The overall process will now be described schematically with reference to FIG. 7.

FIG. 7 shows the overall configuration of a video coding system. Referring to FIG. 7, the video coding system includes an encoder 100, a predecoder 200, and a decoder 300. The encoder 100 encodes an input video into a bitstream 20. The predecoder 200 truncates part of the texture data in the bitstream 20 according to extraction conditions such as bitrate, resolution or frame rate determined considering the communication environment. The predecoder 200, therefore, implements scalability for the texture data. The predecoder 200 also implements motion scalability by truncating part of the motion data in the bitstream 20 in an order from the highest to the lowest layers according to the communication environment or the number of texture bits. By implementing texture or motion scalability in this way, the predecoder can extract various bitstreams 25 from the original bitstream 20.

The decoder 300 generates an output video 30 from the extracted bitstream 25. Of course, either the predecoder 200 or the decoder 300 or both may extract the bitstream 25 according to the extraction conditions.
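The truncation order used for motion scalability (highest layer first) can be illustrated with a small Python sketch. The function name, the notion of a motion "budget", and the choice to always keep the base layer are assumptions made for illustration; the specification does not state how the predecoder decides how many layers to keep.

```python
def truncate_motion_layers(layer_sizes, budget):
    """Return how many motion layers to keep, dropping the highest layers first.

    layer_sizes is ordered base layer first; the base layer is always kept
    (an assumption of this sketch).
    """
    kept = len(layer_sizes)
    while kept > 1 and sum(layer_sizes[:kept]) > budget:
        kept -= 1  # drop the current highest enhancement layer
    return kept

# Fourth-embodiment sizes from Table 5: base, E1, E2.
sizes = [48.12, 11.13, 21.25]
for budget in (100.0, 60.0, 10.0):
    print(budget, "->", truncate_motion_layers(sizes, budget), "layer(s) kept")
```

In this sketch a generous budget keeps all three layers, a tighter one drops only the second enhancement layer, and a very small one falls back to the base layer alone.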

FIG. 8 is a block diagram of an encoder 100 of a video coding system. The encoder 100 includes a partitioning module 110, a motion vector reconstruction module 120, a temporal filtering module 130, a spatial transform module 140, a quantization module 150, and an entropy encoding module 160.

The partitioning module 110 partitions an input video 10 into several groups of pictures (GOPs), each of which is independently encoded as a unit.

The motion vector reconstruction module 120 finds an actual motion vector for a frame of one GOP with the predetermined pixel accuracy, and sends the motion vector to the temporal filtering module 130. The motion vector reconstruction module 120 uses this actual motion vector and a predetermined method (one of the first through third embodiments) to determine a motion vector component of the base layer. Next, it determines a motion vector component of an enhancement layer with the enhancement layer pixel accuracy that is closer to the actual motion vector. The motion vector reconstruction module 120 also sends an integer motion vector component of the base layer and a symbol value that is the motion vector component of the enhancement layer to the entropy encoding module 160. The multilayered motion information is encoded by the entropy encoding module 160 using a predetermined encoding algorithm.

FIG. 9 is a block diagram of an exemplary motion vector reconstruction module 120 according to the present invention. Referring to FIG. 9, the motion vector reconstruction module 120 includes a motion vector search module 121, a base layer determining module 122, and an enhancement layer determining module 123.

Referring to FIG. 11, in order to implement the aforementioned fourth embodiment of the present invention, the motion vector reconstruction module 120 further includes an enhancement layer compression module 125 with either a first or second compression module 126 or 127 or both.

The motion vector search module 121 performs a motion vector search of each block in a current frame (at a predetermined pixel accuracy) in order to obtain an actual motion vector. The block may be a fixed-size or variable-size block. When a variable-size block is used, information about the block size (or mode) needs to be transmitted together with the actual motion vector.

In general, to accomplish a motion vector search, a current image frame is partitioned into blocks of a predetermined pixel size, and a block in a reference image frame is compared with the corresponding block in the current image frame according to the predetermined pixel accuracy in order to derive the difference between the two blocks. A motion vector that gives the minimum sum of errors is designated as the motion vector for the current block. A search range may be predefined using parameters. A smaller search range reduces search time and exhibits good performance when a motion vector exists within the search range. However, the prediction accuracy will be decreased for a fast-motion image where the motion vector does not exist within the range.
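The general search procedure described above can be sketched as an exhaustive block-matching search. This is a minimal illustration under several assumptions: integer-pixel accuracy only, the sum of absolute differences (SAD) as the error measure, frames stored as lists of rows, and hypothetical function names. Sub-pixel search would additionally require interpolating the reference frame.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + j + dy][bx + i + dx])
        for j in range(n)
        for i in range(n)
    )

def full_search(cur, ref, bx, by, n, search_range):
    """Return the (dx, dy) within +/- search_range that minimizes the SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block would fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]

# Synthetic 8x8 reference frame and a current frame displaced by (dx, dy) = (1, -1).
ref = [[x * x + 3 * y * y + x * y for x in range(8)] for y in range(8)]
cur = [[ref[(y - 1) % 8][(x + 1) % 8] for x in range(8)] for y in range(8)]

motion_vector = full_search(cur, ref, bx=2, by=2, n=2, search_range=2)
print(motion_vector)
```

Enlarging `search_range` increases the number of candidates quadratically, which is the search-time trade-off mentioned above.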

Motion estimation may be performed using variable size blocks instead of the above fixed-size block. In motion estimation using a variable size block, a motion vector search is performed on blocks of variable pixel sizes to determine a variable block size and a motion vector that minimize a predetermined cost function J.

The cost function is defined by Equation (3):

$$J = D + \lambda \times R \qquad (3)$$

where D is the number of bits used for coding a frame difference, R is the number of bits used for coding an estimated motion vector, and λ is a Lagrangian coefficient.
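A minimal sketch of how Equation (3) could be used to choose among candidate block sizes is shown below. The candidate list, its D and R values, and the function names are made up purely for illustration; they are not measurements from the specification.

```python
def rd_cost(d: float, r: float, lam: float) -> float:
    """Equation (3): J = D + lambda * R."""
    return d + lam * r

def best_candidate(candidates, lam):
    """Return the (label, D, R) candidate with the smallest cost J."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))

# Hypothetical candidates: (block size, D, R). Smaller blocks lower D but need
# more motion vector bits R.
candidates = [("16x16", 900, 10), ("8x8", 700, 40), ("4x4", 650, 120)]

for lam in (0.1, 1.0, 10.0):
    label, d, r = best_candidate(candidates, lam)
    print(f"lambda={lam}: choose {label} (J={rd_cost(d, r, lam)})")
```

A larger λ penalizes motion vector bits more heavily and therefore favors larger blocks in this toy example, while a small λ favors the finer partition.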

The base layer determining module 122 determines an integer motion vector component of a base layer according to the first through third embodiments. In the first embodiment, it determines the motion vector component of the base layer using spatial correlation with the motion vector components of neighboring blocks and by rounding the decimal part of the actual motion vector up or down.

In the second embodiment, the base layer determining module 122 determines the motion vector component of the base layer by separating the actual motion vector into a sign and a magnitude. The magnitude of the motion vector is represented by an unsigned integer to which the original sign is attached. The determination process is shown in Equation (1).

In the third embodiment, the base layer determining module 122 determines the motion vector component of the base layer by finding an integer value nearest to the actual motion vector. This nearest integer value is calculated by Equation (2).

The enhancement layer determining module 123 determines a motion vector component of an enhancement layer in such a way as to minimize an error between the actual motion vector and the motion vector component. When two or more vectors with the same error exist, the motion vector that minimizes the error of the motion vector in the immediately lower layer is chosen as the motion vector component of the enhancement layer.

For example, when a motion vector consists of four layers as shown in FIG. 10, a motion vector component of a base layer is determined according to the first through third embodiments and motion vector components of the first through third enhancement layers are determined using a separate method. Assuming that the value 1 is determined as the motion vector component of the base layer according to one of the first through third embodiments, a process for determining the motion vector components of the enhancement layers will now be described with reference to FIG. 10. Here, a “cumulative value” of a layer is defined as the sum of the motion vector components of that layer and of the lower layers.

Referring to FIG. 10, the cumulative value of the first enhancement layer is set to 0.5 as it is the closest value to 0.625, so −0.5 is determined to be the motion vector component of the first enhancement layer. Two cumulative values, 0.5 and 0.75, having the same error relative to 0.625, exist in the second enhancement layer, but 0.5 is selected since it is closer to the cumulative value of the first enhancement layer. Thus, 0 is determined as the motion vector component of the second enhancement layer, and then 0.125 is determined as the motion vector component of the third enhancement layer.
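The FIG. 10 walk-through can be reproduced with exact fractions. The sketch below assumes that each enhancement layer k contributes a symbol in {−1, 0, 1} times a step of 2^(−k) pixel, and that ties are resolved toward the cumulative value of the immediately lower layer; the function name and signature are illustrative.

```python
from fractions import Fraction

def decompose(target: Fraction, base: int, num_enhancement_layers: int):
    """Return the enhancement-layer components that refine `base` toward `target`."""
    components = []
    cumulative = Fraction(base)
    for k in range(1, num_enhancement_layers + 1):
        step = Fraction(1, 2 ** k)  # 1/2, 1/4, 1/8, ... pixel accuracy
        # Smallest error first; on a tie prefer symbol 0 (stay at the lower layer).
        symbol = min(
            (-1, 0, 1),
            key=lambda s: (abs(target - (cumulative + s * step)), abs(s)),
        )
        components.append(symbol * step)
        cumulative += symbol * step
    return components

components = decompose(Fraction(5, 8), base=1, num_enhancement_layers=3)
print([float(c) for c in components])
```

For an actual motion vector of 0.625 and a base-layer component of 1, the sketch yields the components −0.5, 0 and 0.125 described above, and the components sum with the base layer to 0.625.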

In order to implement the aforementioned method according to the fourth embodiment of the present invention, the motion vector reconstruction module 120 further includes the enhancement layer compression module 125 with either the first or second compression module 126 or 127 or both, as shown in FIG. 11.

When the motion vector component of the first enhancement layer is a negative number, the first compression module 126 converts the negative number into a positive number having the same magnitude. When the motion vector component of the first enhancement layer is not 0, the second compression module 127 does not encode the motion vector component of the second enhancement layer.

Referring to FIG. 8, to reduce temporal redundancies, the temporal filtering module 130 uses motion vectors obtained by the motion vector reconstruction module 120 to decompose frames into low-pass and high-pass frames in the direction of a temporal axis. A temporal filtering algorithm such as Motion Compensated Temporal Filtering (MCTF) or Unconstrained MCTF (UMCTF) can be used.

The spatial transform module 140 removes spatial redundancies from these frames using the discrete cosine transform (DCT) or wavelet transform, and creates transform coefficients.

The quantization module 150 quantizes those transform coefficients. Quantization is the process of converting real transform coefficients into discrete values and mapping the quantized coefficients into quantization indices. In particular, when a wavelet transform is used for spatial transformation, embedded quantization can often be used. Embedded ZeroTrees Wavelet (EZW), Set Partitioning in Hierarchical Trees (SPIHT), and Embedded ZeroBlock Coding (EZBC) are examples of embedded quantization algorithms.

The entropy encoding module 160 losslessly encodes the transform coefficients quantized by the quantization module 150 and the motion information generated by the motion vector reconstruction module 120 into a bitstream 20. For entropy encoding, various techniques such as arithmetic encoding and variable-length encoding may be used.

FIG. 12 is a block diagram of a decoder 300 in a video coding system according to an embodiment of the present invention.

The decoder 300 includes an entropy decoding module 310, an inverse quantization module 320, an inverse spatial transform module 330, an inverse temporal filtering module 340, and a motion vector reconstruction module 350.

The entropy decoding module 310 performs the inverse of an entropy encoding process to extract texture information (encoded frame data) and motion information from the bitstream 20.

FIG. 13 is a block diagram of an exemplary motion vector reconstruction module 350 according to the present invention. The motion vector reconstruction module 350 includes a layer reconstruction module 351 and a motion addition module 352.

The layer reconstruction module 351 interprets the extracted motion information and recognizes motion information for each layer. The motion information contains block information and motion vector information for each layer. The layer reconstruction module 351 then reconstructs a motion vector component of each layer from a corresponding layer value contained in the motion information. Here, the “layer value” means a value received from the encoder, specifically, an integer value representing a motion vector component of a base layer or a symbol value representing a motion vector component of an enhancement layer. When the layer value is a symbol value, the layer reconstruction module 351 reconstructs the original motion vector component from the symbol value.

The motion addition module 352 reconstructs a motion vector by adding the motion vector components of the base layer and the enhancement layer together, and sends the motion vector to the inverse temporal filtering module 340.

FIG. 14 is a block diagram of another exemplary motion vector reconstruction module 350 for implementing the method according to the fourth embodiment of the present invention.

Referring to FIG. 14, the motion vector reconstruction module 350 includes a layer reconstruction module 351, a motion addition module 352, and an enhancement layer reconstruction module 353 with either a first reconstruction module 354 or a second reconstruction module 355, or both.

In order to reconstruct a motion vector component of a first enhancement layer when a value of the extracted information of the first enhancement layer is not 0, the first reconstruction module 354 attaches a sign to this value that is opposite to the sign of a motion vector component of a base layer, and obtains a motion vector component corresponding to the resultant value (symbol). When the value of the extracted information of the first enhancement layer is 0, the motion vector component is 0.

In order to reconstruct a motion vector component of a second enhancement layer, the second reconstruction module 355 sets the value of the motion vector component of the second enhancement layer to 0 when the value of the first enhancement layer is not 0. When the value is 0, the second reconstruction module 355 obtains a motion vector component corresponding to a value of the second enhancement layer. Then, the motion addition module 352 reconstructs a motion vector by adding the motion vector components of the base layer and the first and second enhancement layers together.
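The decoder-side steps just described (first reconstruction, second reconstruction, motion addition) can be sketched in quarter-pel integer units using the values of Table 4. The function name, the `None` marker for an untransmitted E2 value, and the error raised for a non-zero E1 with a zero base layer are illustrative assumptions.

```python
X = None  # E2 value that was not transmitted ("X" in Table 4)

def reconstruct_quarter_pel(base4: int, e1_value: int, e2_value) -> int:
    """Return 4x from the base-layer value 4*x_b and the received E1/E2 values."""
    # First reconstruction: give E1 the sign opposite to that of the base layer.
    if e1_value == 0:
        e1 = 0
    elif base4 > 0:
        e1 = -e1_value
    elif base4 < 0:
        e1 = e1_value
    else:
        raise ValueError("non-zero E1 with a zero base layer has no sign to invert")
    # Second reconstruction: E2 is taken to be 0 whenever E1 is non-zero.
    e2 = 0 if e1 != 0 else e2_value
    # Motion addition: a half-pel symbol is worth 2 quarter-pels, a quarter-pel symbol 1.
    return base4 + 2 * e1 + e2

# Rows copied from Table 4.
base_row = [-8, -8, -4, -4, -4, -4, 0, 0, 0, 4, 4, 4, 4, 8, 8]
e1_row = [0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0]
e2_row = [1, X, -1, 0, 1, X, -1, 0, 1, X, -1, 0, 1, X, -1]

decoded = [reconstruct_quarter_pel(b, a, c) for b, a, c in zip(base_row, e1_row, e2_row)]
print(decoded)
```

Under these assumptions, every column of Table 4 decodes back to the original 4x value, so the two compression rules lose no information for the cases shown.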

The inverse quantization module 320 performs inverse quantization on the extracted texture information and outputs transform coefficients. Inverse quantization is the process of obtaining quantized coefficients from quantization indices received from the encoder 100. A mapping table of indices and quantization coefficients is received from the encoder 100.

Performing the inverse of the spatial transform, the inverse spatial transform module 330 inverse-transforms the transform coefficients into transform coefficients in a spatial domain. For example, in the DCT transform the transform coefficients are inverse-transformed from the frequency domain to the spatial domain. In the wavelet transform, the transform coefficients are inverse-transformed from the wavelet domain to the spatial domain.

The inverse temporal filtering module 340 performs inverse temporal filtering on the transform coefficients in the spatial domain (i.e., a temporal residual image) using the reconstructed motion vectors received from the motion vector reconstruction module 350 in order to reconstruct frames making up a video sequence.

The term “module”, as used herein, refers to, but is not limited to, a software or hardware component such as a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC), which performs certain tasks. A module may advantageously be configured to reside on an addressable storage medium and to execute on one or more processors. Thus, a module may include, by way of example, components such as software components, object-oriented software components, class components and task components, processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, or variables. The functionality provided for in the components and modules may be combined into fewer components and modules or further separated into additional components and modules. In addition, the components and modules may be implemented in such a way that they execute on one or more computers in a communication system.

FIGS. 15 through 17 illustrate a structure of a bitstream 400. Specifically, FIG. 15 is a schematic diagram illustrating an overall structure of the bitstream 400.

The bitstream 400 is composed of a sequence header field 410 and a data field 420 containing a plurality of GOP fields 430 through 450.

The sequence header field 410 specifies image properties such as frame width (2 bytes) and height (2 bytes), a GOP size (1 byte), and a frame rate (1 byte).
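A sequence header with the field widths listed above occupies six bytes. The sketch below packs and parses such a header with Python's standard `struct` module. The byte order (big-endian), the field order (as listed), the unsigned interpretation of each field, and the example values are assumptions made for illustration; the specification gives only the field sizes.

```python
import struct

# Assumed layout: width (2 bytes), height (2 bytes), GOP size (1 byte),
# frame rate (1 byte), all unsigned, big-endian, no padding.
SEQUENCE_HEADER = struct.Struct(">HHBB")

def parse_sequence_header(data: bytes) -> dict:
    """Decode the first six bytes of `data` as a sequence header."""
    width, height, gop_size, frame_rate = SEQUENCE_HEADER.unpack_from(data, 0)
    return {
        "width": width,
        "height": height,
        "gop_size": gop_size,
        "frame_rate": frame_rate,
    }

# Example: a CIF-sized (352 x 288) sequence at 30 Hz with a hypothetical GOP size of 16.
example = SEQUENCE_HEADER.pack(352, 288, 16, 30)
header = parse_sequence_header(example)
print(len(example), header)
```

One consequence of the one-byte fields is that GOP size and frame rate are each limited to values between 0 and 255 under this interpretation.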

The data field 420 contains all the image information and other information (motion vector, reference frame number, etc.) needed to reconstruct an image.

FIG. 16 shows the detailed structure of each GOP field 430. Referring to FIG. 16, the GOP field 430 consists of a GOP header 460, a T(0) field 470 that specifies information about a first frame (encoded without reference to another frame) that has been subjected to temporal filtering, a motion vector (MV) field 480 specifying a set of motion vectors, and a “the other T” field 490 specifying information on frames other than the first frame (encoded with reference to another frame).

Unlike the sequence header field 410 that specifies properties of the entire video sequence, the GOP header field 460 specifies image properties of a GOP such as temporal filtering order.

FIG. 17 shows the detailed structure of the MV field 480 consisting of MV(1) through MV(n-1) fields.

Referring to FIG. 17, each of the MV(1) through MV(n-1) fields specifies variable size block information such as size and position of each variable size block and motion vector information (symbols representing motion vector components) for each layer.

In concluding the detailed description, those skilled in the art will appreciate that many variations and modifications can be made to the exemplary embodiments without substantially departing from the principles of the present invention. Therefore, the disclosed exemplary embodiments of the invention are used in a generic and descriptive sense only and not for purposes of limitation.

The present invention reduces the size of an enhancement layer while minimizing an error in a base layer. The present invention also enables adaptive allocation of the amount of bits between motion information and texture information using motion scalability.

1. An apparatus for reconstructing a motion vector obtained at apredetermined pixel accuracy, the apparatus comprising: a base layerdetermining module determining a motion vector component of a base layerusing the obtained motion vector according to a pixel accuracy of thebase layer; and an enhancement layer determining module determining amotion vector component of an enhancement layer according to a pixelaccuracy of the enhancement layer, so that a sum of the motion vectorcomponent of the enhancement layer and the motion vector component ofthe base layer is close to the obtained motion vector.
 2. The apparatusof claim 1, wherein the base layer determining module determines themotion vector component of the base layer that is close to a valuepredicted from motion vectors of neighboring blocks according to thepixel accuracy of the base layer.
 3. The apparatus of claim 1, whereinin order to determine the motion vector component of the base layeraccording to the pixel accuracy of the base layer, the base layerdetermining module separates the obtained motion vector into an originalsign and a magnitude, uses an unsigned value to represent the magnitudeof the motion vector, and attaches the original sign to the unsignedvalue.
 4. The apparatus of claim 1, wherein the base layer determiningmodule determines a value closest to the obtained motion vector as themotion vector component of the base layer according to the pixelaccuracy of the base layer.
 5. The apparatus of claim 4, wherein themotion vector component of the base layer is x_(b) and is determinedusing x_(b)=sign(x)└|x|+0.5┘, where sign(x) denotes a signal functionthat returns values of 1 and −1 when variable x is a positive value anda negative value, respectively, |x| denotes an absolute value functionwith respect to the variable x, and └|x|+0.5┘ denotes a function thatgives a largest integer not exceeding |x|+0.5 by stripping a decimalpart.
 6. The apparatus of claim 4, further comprising a first compression module removing redundancy in a motion vector component of a first enhancement layer using a first relationship wherein a sign of the motion vector component of the first enhancement layer is opposite to a sign of the motion vector component of the base layer when the motion vector component of the first enhancement layer is not 0.
 7. The apparatus of claim 6, further comprising a second compression module removing redundancy in a motion vector component of a second enhancement layer using a second relationship wherein the motion vector component of the second enhancement layer is always 0 when the motion vector component of the first enhancement layer is not 0.
 8. A video encoder using a motion vector consisting of multiple layers, the encoder comprising: a motion vector reconstruction module including a motion vector search module obtaining the motion vector with a predetermined pixel accuracy, a base layer determining module determining a motion vector component of a base layer using the obtained motion vector according to a pixel accuracy of the base layer; an enhancement layer determining module determining a motion vector component of an enhancement layer so that a sum of the motion vector component of the enhancement layer and the motion vector component of the base layer is close to the obtained motion vector according to a pixel accuracy of the enhancement layer; a temporal filtering module removing temporal redundancies by filtering frames in a direction of a temporal axis using the obtained motion vector; a spatial transform module removing spatial redundancies from the filtered frames from which the temporal redundancies have been removed and creating transform coefficients; and a quantization module performing quantization on the transform coefficients.
 9. An apparatus for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the apparatus comprising: a layer reconstruction module reconstructing a motion vector component of the base layer and a motion vector component of the at least one enhancement layer from a value of the base layer and a value of the at least one enhancement layer, respectively, the values of the base layer and the at least one enhancement layer being interpreted from an input bitstream; and a motion addition module adding the reconstructed motion vector components of the base layer and the at least one enhancement layer together and providing the motion vector.
 10. An apparatus for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the apparatus comprising: a first reconstruction module reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer interpreted from an input bitstream, which is opposite to a sign of a corresponding value of the base layer; a layer reconstruction module reconstructing a motion vector component of the base layer and a motion vector component of at least one enhancement layer other than the first enhancement layer from the corresponding value of the base layer and a value of the at least one enhancement layer other than the first enhancement layer, respectively, the corresponding value of the base layer and the value of the at least one enhancement layer other than the first enhancement layer being interpreted from the input bitstream; and a motion addition module adding the reconstructed motion vector components of the base layer, the first enhancement layer, and the at least one enhancement layer other than the first enhancement layer together and providing the motion vector.
 11. An apparatus for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the apparatus comprising: a first reconstruction module reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer interpreted from an input bitstream, which is opposite to a sign of a corresponding value of the base layer; a second reconstruction module setting a motion vector component of a second enhancement layer to 0 when the value of the first enhancement layer is not 0 and reconstructing the motion vector component of the second enhancement layer from a value of the second enhancement layer interpreted from the input bitstream when the value of the first enhancement layer is 0; a layer reconstruction module reconstructing a motion vector component of the base layer and a motion vector component of a third enhancement layer other than the first and the second enhancement layers from the corresponding value of the base layer and a value of the third enhancement layer, respectively, the corresponding value of the base layer and the value of the third enhancement layer being interpreted from the input bitstream; and a motion addition module adding the reconstructed motion vector component of the base layer and the reconstructed motion vector components of the first, the second, and the third enhancement layers together and providing the motion vector.
 12. A video decoder using a motion vector consisting of multiple layers, the decoder comprising: an entropy decoding module interpreting an input bitstream and extracting texture information and motion information from the bitstream; a motion vector reconstruction module reconstructing motion vector components of the multiple layers from corresponding values of the multiple layers contained in the extracted motion information and providing the motion vector after adding the motion vector components of the multiple layers together; an inverse quantization module applying inverse quantization to the texture information and outputting transform coefficients; an inverse spatial transform module inversely transforming the transform coefficients into transform coefficients in a spatial domain by performing an inverse of a spatial transform; and an inverse temporal filtering module performing inverse temporal filtering on the inversely transformed transform coefficients in the spatial domain using the provided motion vector and reconstructing frames in a video sequence.
 13. The decoder of claim 12, wherein the motion vector reconstruction module comprises: a first reconstruction module reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer contained in the motion information, which is opposite to a sign of a corresponding value of a base layer; a layer reconstruction module reconstructing a motion vector component of the base layer and a motion vector component of at least one enhancement layer other than the first enhancement layer from the corresponding value of the base layer and a value of the at least one enhancement layer other than the first enhancement layer, respectively; and a motion addition module adding the reconstructed motion vector components of the base layer, the first enhancement layer, and the at least one enhancement layer other than the first enhancement layer together and providing the motion vector.
 14. The decoder of claim 12, wherein the motion vector reconstruction module comprises: a first reconstruction module reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer contained in the motion information, which is opposite to a sign of a corresponding value of a base layer; a second reconstruction module setting a motion vector component of a second enhancement layer to 0 when the value of the first enhancement layer is not 0 and reconstructing the motion vector component of the second enhancement layer from a value of the second enhancement layer contained in the motion information when the value of the first enhancement layer is 0; a layer reconstruction module reconstructing a motion vector component of the base layer and a motion vector component of at least one enhancement layer other than the first and second enhancement layers from the corresponding value of the base layer and a value of the at least one enhancement layer other than the first and the second enhancement layers contained in the motion information, respectively; and a motion addition module adding the reconstructed motion vector components of the base layer, the first enhancement layer, the second enhancement layer, and the at least one enhancement layer other than the first and the second enhancement layers together and providing the motion vector.
 15. A method for reconstructing a motion vector obtained at a predetermined pixel accuracy, the method comprising: determining a motion vector component of a base layer using the obtained motion vector according to a pixel accuracy of the base layer; and determining a motion vector component of an enhancement layer so that a sum of the motion vector component of the enhancement layer and the motion vector component of the base layer is close to the obtained motion vector according to a pixel accuracy of the enhancement layer.
 16. The method of claim 15, wherein in the determining of the motion vector component of the base layer, the motion vector component of the base layer is determined to be close to a value predicted from motion vectors of neighboring blocks according to the pixel accuracy of the base layer.
 17. The method of claim 15, wherein in the determining of the motion vector component of the base layer, the motion vector component of the base layer is determined according to the pixel accuracy of the base layer by separating the obtained motion vector into an original sign and a magnitude, using an unsigned value to represent the magnitude of the motion vector, and attaching the original sign to the unsigned value.
 18. The method of claim 15, wherein in the determining of the motion vector component of the base layer, a value closest to the obtained motion vector is determined as the motion vector component of the base layer according to the pixel accuracy of the base layer.
 19. A method for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the method comprising: reconstructing a motion vector component of the base layer and a motion vector component of the at least one enhancement layer from a value of the base layer and a value of the at least one enhancement layer, respectively, the values of the base layer and the at least one enhancement layer being interpreted from an input bitstream; and adding the reconstructed motion vector components of the base layer and the at least one enhancement layer together and providing the motion vector.
 20. A method for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the method comprising: reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer interpreted from an input bitstream, which is opposite to a sign of a corresponding value of the base layer; reconstructing a motion vector component of the base layer and a motion vector component of at least one enhancement layer other than the first enhancement layer from the corresponding value of the base layer and a value of the at least one enhancement layer other than the first enhancement layer, respectively, the corresponding value of the base layer and the value of the at least one enhancement layer other than the first enhancement layer being interpreted from the input bitstream; and adding the reconstructed motion vector components of the base layer, the first enhancement layer, and the at least one enhancement layer other than the first enhancement layer together and providing the motion vector.
 21. A method for reconstructing a motion vector consisting of a base layer and at least one enhancement layer, the method comprising: reconstructing a motion vector component of a first enhancement layer by attaching a sign to a value of the first enhancement layer interpreted from an input bitstream, which is opposite to a sign of a corresponding value of the base layer; setting a motion vector component of a second enhancement layer to 0 when the value of the first enhancement layer is not 0 and reconstructing the motion vector component of the second enhancement layer from a value of the second enhancement layer interpreted from the input bitstream when the value of the first enhancement layer is 0; reconstructing a motion vector component of the base layer and a motion vector component of at least one enhancement layer other than the first and the second enhancement layers from the corresponding value of the base layer and a value of the at least one enhancement layer other than the first and the second enhancement layers, respectively, the corresponding value of the base layer and the value of the at least one enhancement layer other than the first and the second enhancement layers being interpreted from the input bitstream; and adding the reconstructed motion vector components of the base layer, the first enhancement layer, the second enhancement layer, and the at least one enhancement layer other than the first and the second enhancement layers together and providing the motion vector.
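Illustrative example (not part of the claims). The layer decomposition recited in claims 4 to 7 and the reconstruction recited in claim 21 can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes a base layer at 1-pixel accuracy, a first enhancement layer at 1/2-pixel accuracy, and a second enhancement layer at 1/4-pixel accuracy; it stores every component as an integer in quarter-pixel units; and it resolves ties in the first enhancement layer toward zero. The function names (split_layers, encode, decode) and the dictionary layout are hypothetical. Under these assumptions the two relationships of claims 6 and 7 hold, so the sign of the first enhancement layer and, when that layer is nonzero, the whole second enhancement layer need not be transmitted.

```python
def sign(v):
    """Return 1, -1, or 0 for a positive, negative, or zero value."""
    return (v > 0) - (v < 0)


def split_layers(q):
    """Split one motion vector component q (integer, quarter-pixel units)
    into (base, e1, e2), all expressed in quarter-pixel units.

    Assumed accuracies: base = 1 pixel, e1 = 1/2 pixel, e2 = 1/4 pixel.
    """
    # Base layer: x_b = sign(x) * floor(|x| + 0.5), with x = q / 4 pixels.
    base = sign(q) * ((abs(q) + 2) // 4) * 4
    residual = q - base
    # First enhancement layer: half-pixel step, ties resolved toward zero
    # (an assumption; the claims only require the sum to be "close").
    e1 = sign(residual) * (abs(residual) // 2) * 2
    # Second enhancement layer: whatever quarter-pixel remainder is left.
    e2 = residual - e1
    return base, e1, e2


def encode(q):
    """Build the symbols to transmit (hypothetical layout)."""
    base, e1, e2 = split_layers(q)
    return {
        "base": base,
        "e1_mag": abs(e1),              # sign is implied by the base layer
        "e2": None if e1 else e2,       # omitted when e1 is not 0
    }


def decode(symbols):
    """Reconstruct the motion vector component from the symbols."""
    base = symbols["base"]
    # Attach the sign opposite to the sign of the base layer value.
    e1 = -sign(base) * symbols["e1_mag"]
    # The second enhancement layer is 0 whenever e1 is not 0.
    e2 = 0 if e1 else symbols["e2"]
    return base + e1 + e2
```

For example, a component of 1.5 pixels (q = 6) splits into a base of 2 pixels and a first enhancement layer of −0.5 pixel, so split_layers(6) returns (8, -2, 0), while a component of 1.25 pixels (q = 5) leaves the first enhancement layer at 0 and returns (4, 0, 1).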